The Difference and the Norm - Characterising Similarities and Differences Between Databases
Abstract
Suppose we are given a set of databases, such as sales records from different branches. How can we characterise the differences and the norm between these datasets? That is, which patterns characterise the general distribution, and which are important for describing the individual datasets? We study how to discover these pattern sets simultaneously and without redundancy, automatically identifying the patterns that help describe the overall distribution as well as those that are characteristic of specific databases. We define the problem in terms of the Minimum Description Length (MDL) principle, and propose the DIFFNORM algorithm to approximate the MDL-optimal summary directly from data. Empirical evaluation on synthetic and real-world data shows that DIFFNORM efficiently discovers descriptions that accurately characterise the difference and the norm in easily understandable terms.
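To make the MDL idea concrete, the following toy sketch (it is not DIFFNORM, and it is not taken from the paper) asks, for a single item, whether its occurrences across several databases are encoded more cheaply with one shared frequency ("norm") or with one frequency per database ("difference"). The function names, the example data, and the parameter cost of 0.5 * log2(n) bits are all illustrative assumptions.

```python
import math

def bernoulli_bits(k, n):
    """Bits to encode n binary outcomes with k ones, using the estimate p = k/n."""
    if k == 0 or k == n:
        return 0.0
    p = k / n
    return -(k * math.log2(p) + (n - k) * math.log2(1 - p))

def param_bits(n):
    """Assumed cost of stating one frequency parameter: 0.5 * log2(n) bits."""
    return 0.5 * math.log2(n) if n > 1 else 0.0

def classify_item(item, databases):
    """Compare a shared-frequency encoding with a per-database encoding of one item."""
    # (occurrences, size) of the item in each database
    counts = [(sum(item in t for t in db), len(db)) for db in databases]
    total_k = sum(k for k, _ in counts)
    total_n = sum(n for _, n in counts)
    shared = bernoulli_bits(total_k, total_n) + param_bits(total_n)
    separate = sum(bernoulli_bits(k, n) + param_bits(n) for k, n in counts)
    label = "norm" if shared <= separate else "difference"
    return label, shared, separate

# Hypothetical sales records for two branches (each transaction is a set of items).
branch_a = [{"bread", "beer"}] * 5 + [{"bread"}] * 75 + [{"milk"}] * 20
branch_b = [{"bread", "beer"}] * 60 + [{"bread"}] * 20 + [{"milk"}] * 20

for item in ("bread", "beer"):
    label, shared, separate = classify_item(item, [branch_a, branch_b])
    print(f"{item}: {label} (shared={shared:.1f} bits, separate={separate:.1f} bits)")
```

In this made-up data, "bread" is equally frequent in both branches, so the extra parameter of the per-database encoding does not pay off; "beer" is far more frequent in one branch, so describing the branches separately saves bits. The paper's actual model works with pattern sets rather than single items.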
Similar resources
Modelling and Investigating the Differences and Similarities in the Volatility of the Stocks Return in Tehran Stock Exchange Using the Hybrid Model PANEL-GARCH
Efficient financial markets with high degree of transparency do not substantiate the hypothesis that there are differences in the volatility of return. Generally, there are factors rejecting any perfect similarity in the volatility of return in the emerging stock markets, as previous studies in Iran have confirmed the complete difference. On the other hand, the hybrid model PANEL-GARCH has the ...
Comparison between political action of Mohammad Mosaddegh and Jamal abd al_nasir encountering the colonization: similarities and differences
Iran and Egypt can be considered the main pioneers of the struggle against colonization in the Middle East and the Islamic world in the first half of the twentieth century. Mohammad Mosaddegh and Jamal abd al_nasir, as pioneers of independence-seeking, by leading the anti-colonization and liberal movements of their nations introduced a new form of struggle against colonization that eventually became an appro...
Study on the Contrast between Two Seismic Response Analysis Programs of Soil Layer
56 ground motions of the bedrock and surface are selected from 28 stiff sites (site class I and site class II) of the KiK-net station. The peak acceleration, response spectra and shear strain of actual hard sites are calculated by using SHAKE2000 and LSSRLI-1. The similarities and differences between SHAKE2000 and LSSRLI-1 and their differences from measured records are analyzed. It provides a ba...
ON THE EFFECTS OF ARA-A AND ARA-C ON X-RAY INDUCED DNA LESIONS IN NORMAL HUMAN AND A-T CELLS: SIMILARITIES AND DIFFERENCES.
A better understanding of the mechanism of chromosomal aberration formation could be obtained by using DNA repair inhibitors. Immortalized normal human (MRC 5 SVI) and ataxia telangiectasia (AT 5 BIV A) fibroblastic cell lines were treated with adenosine arabinoside (ara-A) and cytosine arabinoside (ara-C), both potent inhibitors of DNA dsb repair, alone or in combination with x-rays at ...
Discourse Markers in Political Interviews: A Contrastive Study of Persian and English
Due to the significance of multiculturalism in politics, and the central role linguistic devices play in organizing the political discourse, this text-based qualitative study was carried out to compare political interviews in the Iranian and English contexts to find out the probable similarities and differences in the use of discourse markers (DMs) between the two cultures. To this end, three s...